Maintaining Academic Integrity in Programming: Locality-Sensitive Hashing and Recommendations
Abstract
Not many efficient similarity detectors are employed in practice to maintain academic integrity. Perhaps it is because they lack intuitive reports for investigation, only have a command line interface, and/or are not publicly accessible. This paper presents SSTRANGE, an efficient similarity detector with locality-sensitive hashing (MinHash and Super-Bit). The tool features intuitive reports for investigation and a graphical user interface. Further, it is accessible on GitHub. SSTRANGE was evaluated on the SOCO dataset under two performance metrics: f-score and processing time. The evaluation shows that both MinHash and Super-Bit are more efficient than their predecessors (Cosine and Jaccard, with 60% less processing time) and a common similarity measurement (running Karp-Rabin greedy string tiling, with 99% less processing time). The effectiveness trade-off is still reasonable (no more than 24%). Higher effectiveness can be obtained by tuning the number of clusters and stages. To encourage the use of automated similarity detectors, we provide ten recommendations for instructors interested in employing such detectors for the first time. These include consideration of assessment design, irregular patterns of similarity, multiple similarity measurements, and the effectiveness–efficiency trade-off. The recommendations are based on our 2.5-year experience employing similarity detectors (SSTRANGE's predecessors) in 13 course offerings with various assessment designs.
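As a rough illustration of the MinHash idea named in the abstract, the following Python sketch estimates the Jaccard similarity of two token sequences from short signatures instead of comparing the full sets. It is not SSTRANGE's implementation: the use of token 3-grams, the SHA-1-based seeded hashing, the signature length, and the function names (`shingles`, `minhash_signature`, `estimate_jaccard`, `exact_jaccard`) are all assumptions made here for illustration.

```python
import hashlib


def shingles(tokens, n=3):
    """Return the set of contiguous token n-grams (n=3 is an illustrative choice)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def _seeded_hash(item, seed):
    # Salting a stable hash with the seed simulates a family of hash functions.
    data = f"{seed}:{item!r}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, used here only to sanity-check the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Two hypothetical tokenized submissions that differ in one identifier.
    a = "int sum = 0 ; for ( int i = 0 ; i < n ; i ++ ) sum += i ;".split()
    b = "int total = 0 ; for ( int i = 0 ; i < n ; i ++ ) total += i ;".split()
    sa, sb = shingles(a), shingles(b)
    print("exact    :", exact_jaccard(sa, sb))
    print("estimated:", estimate_jaccard(minhash_signature(sa), minhash_signature(sb)))
```

The efficiency gain comes from comparing fixed-length signatures rather than full shingle sets; a longer signature reduces the variance of the estimate at the cost of more hashing.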
Similar resources
Beyond Locality-Sensitive Hashing
We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For $n$ points in $\mathbb{R}^d$, our algorithm achieves $O_c(n^{\rho} + d \log n)$ query time and $O_c(n^{1+\rho} + d \log n)$ space, where $\rho \le 7/(8c^2) + O(1/c^3) + o_c(1)$. This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower boun...
Non-Metric Locality-Sensitive Hashing
Non-metric distances are often more reasonable compared with metric ones in terms of consistency with human perceptions. However, existing locality-sensitive hashing (LSH) algorithms can only support data which are gauged with metrics. In this paper we propose a novel locality-sensitive hashing algorithm targeting such non-metric data. Data in original feature space are embedded into an implici...
Locality-Sensitive Hashing of Curves
We study data structures for storing a set of polygonal curves in $\mathbb{R}^d$ such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact t...
Super-Bit Locality-Sensitive Hashing
Sign-random-projection locality-sensitive hashing (SRP-LSH) is a probabilistic dimension reduction method which provides an unbiased estimate of angular similarity, yet suffers from the large variance of its estimation. In this work, we propose the Super-Bit locality-sensitive hashing (SBLSH). It is easy to implement, which orthogonalizes the random projection vectors in batches, and it is theo...
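The batch orthogonalization described in this abstract can be sketched in plain Python as follows. This is an illustrative reading of the idea, not the authors' code: it assumes Gaussian random vectors, Gram-Schmidt orthonormalization within each batch, a batch size (`depth`) no larger than the data dimension, and an angle estimate of π times the fraction of differing bits. All function names are hypothetical.

```python
import math
import random


def gram_schmidt(vectors):
    """Orthonormalize one batch of vectors with (modified) Gram-Schmidt."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def superbit_projections(dim, depth, num_batches, seed=0):
    """Draw Gaussian vectors and orthonormalize them batch by batch."""
    if not 1 <= depth <= dim:
        raise ValueError("depth must be between 1 and dim")
    rng = random.Random(seed)
    projections = []
    for _ in range(num_batches):
        batch = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(depth)]
        projections.extend(gram_schmidt(batch))
    return projections


def signature(vector, projections):
    """One bit per projection: the sign of the dot product."""
    return [1 if sum(p * x for p, x in zip(proj, vector)) >= 0 else 0
            for proj in projections]


def estimate_angle(sig_a, sig_b):
    """Estimate the angle (radians) as pi times the fraction of differing bits."""
    hamming = sum(a != b for a, b in zip(sig_a, sig_b))
    return math.pi * hamming / len(sig_a)


if __name__ == "__main__":
    projections = superbit_projections(dim=8, depth=4, num_batches=32)
    u = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    v = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # true angle with u is pi/4
    print(estimate_angle(signature(u, projections), signature(v, projections)))
```

Setting `depth=1` reduces this sketch to plain sign-random-projection with unit-length vectors, which makes it easy to compare the two schemes on the same data.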
Journal
Journal title: Education Sciences
Year: 2023
ISSN: 2227-7102
DOI: https://doi.org/10.3390/educsci13010054